ROI and tabSIMLR reproducibility in T1w and rsfMRI

overview

  • we use the SRPBS Traveling Subject MRI Dataset (here)

    • a traveling cohort: 9 healthy subjects traveled to 12 sites to be imaged

    • of the 12 sites, 9 have consistently available T1w and rsfMRI in 6 subjects

    • sites represent variability in both MRI manufacturer and MRI model (high variability)

  • this enables us to investigate the reliability of our imaging-derived phenotypes (IDPs)

    • IDPs computed with ANTsPyMM (latest version)

    • we use the intraclass correlation coefficient (ICC) to assess consistency or reproducibility of the quantitative IDPs

  • the majority of IDPs show high reliability

    • joint reduction of IDPs with SiMLR improves reliability further

ANTsPyMM IDPs derived from the same subjects imaged at different sites, on scanners from various manufacturers, show overall high reliability. This provides empirical evidence that multi-modality MRI can be used to derive quantitative phenotypes on which predictive models may be based.

background

ICC

see doi:10.1016/j.jcm.2016.02.012 for a discussion of the ICC

Cicchetti (1994) gives the following often-quoted guidelines for interpreting kappa or ICC inter-rater agreement measures:

  • below 0.40: poor
  • between 0.40 and 0.59: fair
  • between 0.60 and 0.74: good
  • between 0.75 and 1.00: excellent

A different guideline is given by Koo and Li (2016):

  • below 0.50: poor
  • between 0.50 and 0.75: moderate
  • between 0.75 and 0.90: good
  • above 0.90: excellent
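
The two guidelines assign different labels to the same ICC value, so it helps to apply both side by side. The following is a minimal sketch in base R; the function name `icc_label` and the example values are hypothetical, and the treatment of shared endpoints (for example, whether exactly 0.75 counts as "good" or "excellent") is an assumption of this sketch rather than something the guidelines specify.

```r
# Map ICC values to the qualitative labels of the two guidelines quoted above.
# Intervals are closed on the left, e.g. [0.40, 0.60); this endpoint handling
# is an assumption of the sketch.
icc_label <- function(icc, guideline = c("cicchetti", "koo_li")) {
  guideline <- match.arg(guideline)
  if (guideline == "cicchetti") {
    breaks <- c(-Inf, 0.40, 0.60, 0.75, Inf)
    labels <- c("poor", "fair", "good", "excellent")
  } else {
    breaks <- c(-Inf, 0.50, 0.75, 0.90, Inf)
    labels <- c("poor", "moderate", "good", "excellent")
  }
  as.character(cut(icc, breaks = breaks, labels = labels, right = FALSE))
}

# hypothetical ICC values, for illustration only
example_icc <- c(0.32, 0.70, 0.81, 0.96)
cicchetti <- icc_label(example_icc, "cicchetti")
koo_li <- icc_label(example_icc, "koo_li")
print(data.frame(icc = example_icc, cicchetti = cicchetti, koo_li = koo_li))
```

Wrapping the result in `table()` gives the number of values that fall in each category.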

reliability of T1w data

see this analysis of FreeSurfer on T1w, showing reliability values from 0.81 to 0.88

see this paper on T1w and rsfMRI

Analysis

T1Hier_resnetGrade is a deep-learning-based method that predicts image quality in multi-site data.

Values range from 0 (unusable) to 3 (best quality).

Demonstrate effect of subject vs effect of Scanner or Site

## Warning: Removed 1 rows containing missing values (`geom_point()`).

What is the variability of the measurement if we control for age alone?

# fixed effects for age and Subject; visreg shows the partial effect of Subject
mdl <- lm(outcome ~ age + Subject, data = dd)
visreg::visreg(mdl, "Subject", gg = TRUE) +
    scale_color_brewer(palette = brewpal)

What is the reproducibility of the measurement if we control for age and scanner?

# adds Scanner as a fixed effect (the explicit "+ 1" intercept is implied)
mdl <- lm(outcome ~ age + Subject + Scanner, data = dd)
visreg::visreg(mdl, "Subject", gg = TRUE) +
    scale_color_brewer(palette = brewpal)

# grid.arrange( grobs = visreg::visreg( mdl, gg=TRUE ), ncol=1, main='reproducibility' )

What is the reproducibility of the measurement if we control for site?

# replaces Scanner with Site as the fixed-effect confound
mdl <- lm(outcome ~ age + Subject + Site, data = dd)
visreg::visreg(mdl, "Subject", gg = TRUE) +
    scale_color_brewer(palette = brewpal)

# grid.arrange( grobs = visreg::visreg( mdl, gg=TRUE ), ncol=1, main='reproducibility' )

What is the effect of age if we control for the best confounds?

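library(lme4)  # loaded but not used by lm(); only needed for a mixed model via lmer()
# inside lm(), "+ (Subject)" is parsed as an ordinary fixed effect for Subject,
# not as a random intercept (that would be (1 | Subject) in lmer())
mdl <- lm(outcome ~ age + Scanner + T1Hier_resnetGrade + Subject, data = dd)
# mdl <- lm(outcome ~ age + Subject, data = dd)
visreg::visreg(mdl, "age", gg = TRUE, main = "t=-7.4 vs t=-5.6") +
    scale_color_brewer(palette = brewpal)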

What is the reproducibility of the measurement if we control for motion?

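# note: no motion covariate appears in this formula (same model as age + scanner)
mdl <- lm(outcome ~ age + Subject + Scanner, data = dd)
# grid.arrange() takes its title via `top`, not `main`
gridExtra::grid.arrange(grobs = visreg::visreg(mdl, gg = TRUE), ncol = 1, top = "reproducibility")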

ICC(2,k) (Two-Way Random, Absolute Agreement, Average Measures): This version is used when the raters (or measurement devices) are considered random samples from a larger population of raters and you want to generalize your findings to this broader context. It’s appropriate when different imaging sites might use different equipment or personnel, and you wish to assess the reliability of measurements across these variable conditions.

The ‘k’ form means that reliability is assessed for the average of k measurements (in this case, multiple imaging sessions or sites) rather than for a single measurement. Averaging reduces measurement error, so the ‘k’ values are generally higher than the corresponding single-measurement ICCs.
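
To make the relation between the single-measurement and average-measures forms concrete, here is a minimal base R sketch that computes ICC(2,1) and ICC(2,k) from two-way ANOVA mean squares, following the Shrout and Fleiss (1979) formulas. The function name `icc2` and the simulated data are hypothetical. The sketch assumes a complete, balanced subjects-by-sites matrix; packages that estimate the ICC from a mixed model may return slightly different values.

```r
# ICC(2,1) and ICC(2,k) from two-way ANOVA mean squares.
# x: numeric matrix, rows = subjects, columns = sites (complete and balanced).
icc2 <- function(x) {
  n <- nrow(x)
  k <- ncol(x)
  grand <- mean(x)
  msr <- k * sum((rowMeans(x) - grand)^2) / (n - 1)  # between-subject mean square
  msc <- n * sum((colMeans(x) - grand)^2) / (k - 1)  # between-site mean square
  resid <- x - outer(rowMeans(x), colMeans(x), "+") + grand
  mse <- sum(resid^2) / ((n - 1) * (k - 1))          # residual mean square
  c(ICC2  = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n),
    ICC2k = (msr - mse) / (msr + (msc - mse) / n))
}

# simulated example: 6 subjects imaged at 9 sites (values are illustrative only)
set.seed(1)
subj <- c(-1.5, -1, -0.5, 0.5, 1, 1.5)   # subject effects
site <- seq(-0.4, 0.4, length.out = 9)   # site offsets
x <- outer(subj, site, "+") + matrix(rnorm(6 * 9, sd = 0.5), 6, 9)
est <- icc2(x)
print(round(est, 3))
```

ICC(2,k) equals the Spearman-Brown step-up of ICC(2,1), which is why the average-measures rows in the tables that follow are consistently higher than the single-measurement rows.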

Summary reliability data by Scanner

library(SimplyAgree)  # provides reli_stats()
myrel <- reli_stats("outcome", "Scanner", "Subject", data = dd)
print(myrel)
## 
## Coefficient of Variation (%):  13.1
## Standard Error of Measurement (SEM):  0.0278
## Standard Error of the Estimate (SEE):  0.0377
## Standard Error of Prediction (SEP):  0.0758
## 
## Intraclass Correlation Coefficients with  95 % C.I.
##            Model         Measures  Type    ICC Lower CI Upper CI
## 1 one-way random        Agreement  ICC1 0.3138  0.06440   0.6978
## 2 two-way random        Agreement  ICC2 0.3181  0.07220   0.6987
## 3  two-way fixed      Consistency  ICC3 0.3282  0.06934   0.7097
## 4 one-way random   Avg. Agreement ICC1k 0.6957  0.25603   0.9203
## 5 two-way random   Avg. Agreement ICC2k 0.6999  0.28009   0.9206
## 6  two-way fixed Avg. Consistency ICC3k 0.7095  0.27142   0.9244
plot( myrel )

Summary reliability data by Site

myrel <- reli_stats("outcome", "Site", "Subject", data = dd)
print(myrel)
## 
## Coefficient of Variation (%):  13
## Standard Error of Measurement (SEM):  0.0276
## Standard Error of the Estimate (SEE):  0.0507
## Standard Error of Prediction (SEP):  0.102
## 
## Intraclass Correlation Coefficients with  95 % C.I.
##            Model         Measures  Type    ICC Lower CI Upper CI
## 1 one-way random        Agreement  ICC1 0.3177   0.1255   0.6720
## 2 two-way random        Agreement  ICC2 0.3202   0.1296   0.6726
## 3  two-way fixed      Consistency  ICC3 0.3311   0.1328   0.6842
## 4 one-way random   Avg. Agreement ICC1k 0.8074   0.5636   0.9486
## 5 two-way random   Avg. Agreement ICC2k 0.8092   0.5726   0.9487
## 6  two-way fixed Avg. Consistency ICC3k 0.8167   0.5794   0.9512
plot( myrel )

Site-wise reliability: pairwise comparison rsfMRI_DefaultMode_2_DefaultMode

  • two-way agreement average

Summary reliability data by Site

## 
## Coefficient of Variation (%):  5.51
## Standard Error of Measurement (SEM):  141
## Standard Error of the Estimate (SEE):  387
## Standard Error of Prediction (SEP):  591
## 
## Intraclass Correlation Coefficients with  95 % C.I.
##            Model         Measures  Type    ICC Lower CI Upper CI
## 1 one-way random        Agreement  ICC1 0.7070   0.5008   0.9033
## 2 two-way random        Agreement  ICC2 0.7090   0.5017   0.9039
## 3  two-way fixed      Consistency  ICC3 0.7550   0.5614   0.9224
## 4 one-way random   Avg. Agreement ICC1k 0.9560   0.9003   0.9882
## 5 two-way random   Avg. Agreement ICC2k 0.9564   0.9006   0.9883
## 6  two-way fixed Avg. Consistency ICC3k 0.9652   0.9201   0.9907
## 
## [1] "T1Hier_vol_r_ivcerebellum"
## 
## Coefficient of Variation (%):  4.06
## Standard Error of Measurement (SEM):  0.0352
## Standard Error of the Estimate (SEE):  0.119
## Standard Error of Prediction (SEP):  0.182
## 
## Intraclass Correlation Coefficients with  95 % C.I.
##            Model         Measures  Type    ICC Lower CI Upper CI
## 1 one-way random        Agreement  ICC1 0.4272   0.2108   0.7560
## 2 two-way random        Agreement  ICC2 0.4521   0.2152   0.7699
## 3  two-way fixed      Consistency  ICC3 0.7425   0.5442   0.9176
## 4 one-way random   Avg. Agreement ICC1k 0.8704   0.7063   0.9654
## 5 two-way random   Avg. Agreement ICC2k 0.8813   0.7116   0.9679
## 6  two-way fixed Avg. Consistency ICC3k 0.9629   0.9148   0.9901
## 
## [1] "T1Hier_thk_LRAVG_transverse_temporaldktcortex"
## boundary (singular) fit: see help('isSingular')
## 
## Coefficient of Variation (%):  -70.4
## Standard Error of Measurement (SEM):  0.071
## Standard Error of the Estimate (SEE):  0
## Standard Error of Prediction (SEP):  0.243
## 
## Intraclass Correlation Coefficients with  95 % C.I.
##            Model         Measures  Type      ICC Lower CI Upper CI
## 1 one-way random        Agreement  ICC1 -0.01453 -0.07341   0.2009
## 2 two-way random        Agreement  ICC2  0.00000 -0.05776   0.2106
## 3  two-way fixed      Consistency  ICC3  0.00000 -0.06688   0.2345
## 4 one-way random   Avg. Agreement ICC1k -0.14799 -1.60085   0.6934
## 5 two-way random   Avg. Agreement ICC2k  0.00000 -0.96643   0.7060
## 6  two-way fixed Avg. Consistency ICC3k  0.00000 -1.29460   0.7338
## 
## [1] "rsfMRI_DorsalAttention_2_Subcortical"
## 
## Coefficient of Variation (%):  11.1
## Standard Error of Measurement (SEM):  0.041
## Standard Error of the Estimate (SEE):  0.0999
## Standard Error of Prediction (SEP):  0.174
## 
## Intraclass Correlation Coefficients with  95 % C.I.
##            Model         Measures  Type    ICC Lower CI Upper CI
## 1 one-way random        Agreement  ICC1 0.3701   0.1647   0.7147
## 2 two-way random        Agreement  ICC2 0.3848   0.1828   0.7198
## 3  two-way fixed      Consistency  ICC3 0.4876   0.2603   0.7951
## 4 one-way random   Avg. Agreement ICC1k 0.8409   0.6396   0.9575
## 5 two-way random   Avg. Agreement ICC2k 0.8492   0.6681   0.9585
## 6  two-way fixed Avg. Consistency ICC3k 0.8954   0.7601   0.9722
## 
## [1] "rsfMRI_DorsalAttention_2_DorsalAttention"
## 
## Coefficient of Variation (%):  5.77
## Standard Error of Measurement (SEM):  46.4
## Standard Error of the Estimate (SEE):  115
## Standard Error of Prediction (SEP):  196
## 
## Intraclass Correlation Coefficients with  95 % C.I.
##            Model         Measures  Type    ICC Lower CI Upper CI
## 1 one-way random        Agreement  ICC1 0.4195   0.2044   0.7507
## 2 two-way random        Agreement  ICC2 0.4313   0.2187   0.7546
## 3  two-way fixed      Consistency  ICC3 0.5279   0.2981   0.8184
## 4 one-way random   Avg. Agreement ICC1k 0.8667   0.6981   0.9644
## 5 two-way random   Avg. Agreement ICC2k 0.8722   0.7158   0.9651
## 6  two-way fixed Avg. Consistency ICC3k 0.9096   0.7926   0.9759
## 
## [1] "T1Hier_vol_r_iiicerebellum"

Site-wise reliability

## Warning in sqrt(ICC3 * (1 - ICC3)): NaNs produced

##                                             anat       icc
## 176             Frontoparietal_2_DorsalAttention 0.8564492
## 23                 vol_LRAVG_precentraldktcortex 0.9423100
## 40  thk_LRAVG_caudal_anterior_cingulatedktcortex 0.9593162
## 89                        vol_l_crus_icerebellum 0.9648682
## 75                   thk_mtg_rn_LRAVGdeep_cit168 0.9667763
## 13       vol_LRAVG_medial_orbitofrontaldktcortex 0.9687730

how does site impact image quality

  • higher scores are better

  • a few sites are lower quality

  • a few subjects exhibit lower quality (consistently)

raw ICC for ROI representation

how many ROIs fall in each category of ICC

Do SiMLR-derived IDPs improve reliability? A train/test paradigm across sites

  • train SiMLR on each of the top-quality sites

    • association of T1-derived IDPs and rsfMRI IDPs

    • default parameters for SiMLR (regression and ICA)

  • test on each other site

    • testing just means projecting the IDPs onto the SiMLR bases for both T1 and rsfMRI

  • this lets us look at ICC in the SiMLR space

    • shows that the SiMLR latent space (generally) improves ICC over the raw ROI representation
## [1] "USE SITE:  COI"
## [1] "USE SITE:  HKH"
## [1] "USE SITE:  SWA"
## Warning in sqrt(ICC3 * (1 - ICC3)): NaNs produced
## [1] "USE SITE:  ATV"
## [1] "USE SITE:  ATT"
## [1] "USE SITE:  YC2"
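
The test step described above is a linear projection, which can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the SiMLR implementation: the basis below comes from a plain SVD of the training data and stands in for a learned SiMLR basis, and all names (`train`, `test`, `basis`, `project`) and the simulated data are hypothetical.

```r
set.seed(42)
n_subj <- 6
n_idp <- 20
n_comp <- 3

# simulated IDP matrices (rows = subjects, columns = IDPs); illustrative only
train <- matrix(rnorm(n_subj * n_idp), n_subj, n_idp)                    # training site
test <- train + matrix(rnorm(n_subj * n_idp, sd = 0.1), n_subj, n_idp)  # same subjects, other site

# stand-in basis (assumption): leading right singular vectors of the centered
# training data, giving an n_idp x n_comp matrix
ctr <- colMeans(train)
basis <- svd(sweep(train, 2, ctr))$v[, seq_len(n_comp), drop = FALSE]

# project IDPs onto the basis, centering with the training-site means
project <- function(x, basis, center) sweep(x, 2, center) %*% basis
emb_train <- project(train, basis, ctr)
emb_test <- project(test, basis, ctr)
print(dim(emb_test))
```

Each column of the embedding is then treated like an IDP: its values per subject and site are the input to the ICC computation.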

raw simlr ICC values

simlr ICC interpretation

simple t-test between two representations

## [1] "T-test of SiMLR vs ROI representation: T1 IDPs"
## 
##  Welch Two Sample t-test
## 
## data:  myiccsimlr[t1sel2, "icc"] and myiccs[t1sel1, "icc"]
## t = 5.5797, df = 134.49, p-value = 1.276e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.02090536 0.04386315
## sample estimates:
## mean of x mean of y 
## 0.9848731 0.9524888
## [1] "T-test of SiMLR vs ROI representation: rsfMRI IDPs"
## 
##  Welch Two Sample t-test
## 
## data:  myiccsimlr[rssel2, "icc"] and myiccs[rssel1, "icc"]
## t = 2.46, df = 95.498, p-value = 0.01569
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.0175138 0.1639362
## sample estimates:
## mean of x mean of y 
## 0.7691929 0.6784679
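
The `data:` lines in the output above show that each test compares two vectors of per-feature ICC values, one per representation. A call of that form can be sketched as follows. The vectors here are synthetic stand-ins, not the study's values, and the names `icc_simlr` and `icc_roi` are hypothetical.

```r
set.seed(7)
# synthetic stand-ins for per-feature ICC vectors; capped at 1 as an ICC would be
icc_simlr <- pmin(rnorm(40, mean = 0.95, sd = 0.03), 1)
icc_roi <- pmin(rnorm(100, mean = 0.90, sd = 0.06), 1)

# var.equal = FALSE (the default) selects the Welch test, which allows the two
# groups to differ in variance and in size
res <- t.test(icc_simlr, icc_roi, var.equal = FALSE)
print(res)
```

The Welch form suits this comparison because the two representations have different numbers of features, so the ICC vectors are unpaired and of unequal length.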

Summary tabSimlr vs ROIs: ICC IDP reliability in T1w

Summary tabSimlr vs ROIs: ICC IDP reliability in rsfMRI network connectivity

  • rsfMRI measurements are inter- and intra-network connectivity between the canonical functional networks

    • default mode

    • salience

    • frontoparietal task control

  • …